Significant insights into visual cognition have come from studying real-world perceptual 
expertise. Many have previously reviewed empirical findings and theoretical developments 
from this work. Here we instead provide a brief perspective on approaches, considerations, 
and challenges to studying real-world perceptual expertise. We discuss factors like choosing 
to use real-world versus artificial object domains of expertise, selecting a target domain 
of real-world perceptual expertise, recruiting experts, evaluating their level of expertise, 
and experimentally testing experts in the lab and online. Throughout our perspective, we 
highlight expert birding (also called birdwatching) as an example, as it has been used as a 
target domain for over two decades in the perceptual expertise literature. 
INTRODUCTION 

In nearly every aspect of human endeavor, we find people who 
stand out for their high levels of skill and knowledge. We call 
them experts. Expertise has been studied in domains ranging 
from chess (Chase and Simon, 1973; Gobet and Charness, 2006; 
Connors and Campitelli, 2014; Leone et al., 2014) to physics 
(Chi et al., 1981) to sports (Baker et al., 2003). Perceptual 
experts, such as ornithologists, radiologists, and mycologists, 
are noted for their remarkable ability to rapidly and accurately 
recognize, categorize, and identify objects within some 
domain. Understanding the development of perceptual expertise 
is more than characterizing the behavior of individuals with 
uncanny abilities. Rather, if perceptual expertise is the endpoint 
of the trajectory of normal visual learning, then studying 
perceptual experts can provide insights into the general principles, 
limits, and possibilities of human learning and plasticity 
(e.g., Gauthier et al., 2010). 

Several reviews have highlighted empirical findings and 
theoretical developments from research on perceptual expertise 
in various modalities (for visual expertise, see, e.g., 
McCandliss et al., 2003; Palmeri and Gauthier, 2004; Palmeri and 
Cottrell, 2009; Richler et al., 2011; for auditory expertise, see, 
e.g., Chartrand et al., 2008; Holt and Lotto, 2008; for tactile 
expertise, see, e.g., Behrmann and Ewell, 2003; Reuter et al., 
2012). Here, we instead highlight more practical considerations 
that come with studying perceptual expertise; we focus on 
visual expertise because this modality has been most extensively 
studied. We specifically consider some choices that face 
researchers: whether to use real-world or artificial objects, what 
domain of perceptual expertise to study, how to recruit participants, 
how to evaluate their expertise, and whether to test 
in the lab or via the web. Throughout our perspective, we 
use birding as an example domain because it has been commonly 
used in the literature (e.g., Tanaka and Taylor, 1991; 
Gauthier et al., 2000; Tanaka et al., 2005; Mack et al., 2007; 
Mack and Palmeri, 2011). 



REAL-WORLD vs. ARTIFICIAL DOMAINS OF EXPERTISE 

Expertise-related research has been conducted using both artificial 
and real-world objects. Artificial objects include simple stimuli 
like line orientations, textures, and colors (e.g., Goldstone, 1998; 
Mitchell and Hall, 2014), and relatively complex novel stimuli 
like random dot patterns (Palmeri, 1997), Greebles (Gauthier and 
Tarr, 1997; Gauthier et al., 1998, 1999), and Ziggerins (Wong et al., 
2009a). Real-world objects include birds, dogs, cars, and other 
categories (Tanaka and Taylor, 1991; Gauthier et al., 2000). Studies 
using artificial objects are often training studies, where researchers 
recruit novices and train them to become "experts" in a domain. 
Changes in behavior or brain activity are measured over the course 
of training to understand the development of expertise, making 
these studies longitudinal. The weeks of training used in these 
studies can only be a proxy for the years of experience in real-world 
domains. Because real-world expertise takes so long to develop, 
most real-world studies are cross-sectional. 

An advantage of training studies with artificial objects is 
the power to establish causality. Experimenters have precise 
control over properties of novel objects, relationships between 
them, and how categories are defined (e.g., Richler and Palmeri, 
2014). Participants can be randomly assigned to conditions and 
training and testing can be carefully controlled. As one example, 
Wong et al. (2009a,b) used novel Ziggerins and trained 
people in two different ways, one of which mirrored the individuation 
required for face recognition, the other of which mirrored 
the letter recognition demands of reading. Accordingly, 
the face-like training group showed behavior and brain 
activity similar to that seen in face recognition, while the 
letter-like training group showed behavior and brain activity 
similar to that seen in letter recognition. Studies of artificial 
domains of expertise can thus provide insights into real-world 
domains. 

If researchers are interested in understanding what makes 
experts experts, not just investigating limits of experience- 
related changes, then it is important to complement carefully 



www.frontiersin.org 

August 2014 | Volume 5 | Article 857 | 1 

Shen et al. 

Real-world perceptual expertise 



[Figure 1 plot: x-axis "Levels of Abstraction" (super, basic, subor); y-axis ticks from 600 to 1200.] 

FIGURE 1 | Mean correct categorization response times for a novice 
domain (dogs) and an expert domain (birds) measured online. 

Following Tanaka and Taylor (1991), bird experts were tested in a speeded 
category verification task where they categorized images at the 
superordinate (animal), basic (bird or dog), or subordinate (specific species 
or breed) level. In their novice domain (dogs), a classic basic-level 
advantage was observed, whereby categorization at the basic level was 
significantly faster than at the superordinate (t(22) = 2.67, p = 0.014) and 
subordinate levels (t(22) = 6.75, p < 0.001). In their expert domain (birds), 
subordinate categorization was as fast as basic-level categorization 
(t(22) = 0.81, p = 0.429). This replication was conducted using a custom online 
WordPress + Flash website with only 23 participants from a single 
short 10 min experimental session. Error bars represent 95% confidence 
intervals on the level × domain interaction. 



controlled laboratory studies using artificial domains with the 
study of real-world experts. Because of their quasi-experimental 
nature - recruiting novices and those with varying levels of expertise 
as they occur in the real world - studies of real-world experts cannot establish 
unambiguous causal relationships between expertise and behavioral 
or brain changes. Apart from considerations of external 
validity, studies of real-world experts permit the study of a range 
and extent of expertise that cannot easily be reproduced in the 
laboratory. And practically speaking, testing real-world perceptual 
experts on real-world perceptual stimuli saves researchers the 
effort and expense needed to train participants in an artificial 
domain. 

Studies using real-world domains also come full circle to inform 
studies using artificial domains. For example, consider the classic 
result of Tanaka and Taylor (1991), reproduced in our own online 
replication in Figure 1. Bird experts categorized birds (their expert 
domain) and dogs (their novice domain). For novices (Rosch et al., 
1976), objects are categorized faster at a basic level (dog) than a 
superordinate (animal) or subordinate level (blue jay), while for 
experts (Tanaka and Taylor, 1991; Johnson and Mervis, 1997), 
objects are categorized as fast at a subordinate level as a basic level. 
This entry-level shift (Jolicoeur et al., 1984; see also Tanaka et al., 
2005; Mack et al., 2009; Mack and Palmeri, 2011) has been used 
as a behavioral marker of expertise in training studies employing 
artificial domains (Gauthier et al., 2000; Gauthier and Tarr, 2002). 
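The basic-level advantage and entry-level shift described above are typically assessed by comparing each participant's mean correct response time across levels of abstraction. As a minimal sketch, assuming a paired-samples t-test (which with 23 participants would give 22 degrees of freedom), the statistic could be computed as follows. The function name `paired_t` and the response times are hypothetical and purely illustrative; they are not data from the study.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # Work on within-participant differences (x minus y).
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    if se == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / se, n - 1


# Hypothetical per-participant mean correct RTs (ms) in a novice domain.
subordinate_rt = [820.0, 790.0, 905.0, 760.0, 850.0]
basic_rt = [700.0, 720.0, 780.0, 690.0, 730.0]

t, df = paired_t(subordinate_rt, basic_rt)
print(f"t({df}) = {t:.2f}")
```

A positive t here would indicate slower subordinate than basic-level categorization, the pattern expected in a novice domain; a value near zero would be consistent with the entry-level shift expected in an expert domain.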

Our group recently reviewed considerations that factor into 
studies using artificial domains (Richler and Palmeri, 2014), so 
here we focus on real-world domains for the remainder of our 
perspective. 

DOMAINS OF REAL-WORLD PERCEPTUAL EXPERTISE 

In addition to everyday domains of perceptual expertise, like faces 
(Bukach et al., 2006) and letters (McCandliss et al., 2003), studies 
have used domains ranging from cars and birds (Gauthier 
et al., 2000), where expertise is not uncommon, to more specialized 
and sometimes esoteric domains like latent fingerprint 
identification (Busey and Parada, 2010; Dror and Cole, 2010), 
budgie identification (Campbell and Tanaka, 2014), and chick 
sexing (Biederman and Shiffrar, 1987). The particular choice of 
expert domain depends on a combination of theoretical goals and 
practical considerations. 

For example, consider a goal of understanding how the abil- 
ity to categorize at different levels of abstraction changes with 
perceptual expertise (Mack and Palmeri, 2011), which impacts 
understanding of how categories are learned, represented, and 
accessed. Birding is a useful domain because birders must make 
subordinate and sub-subordinate categorizations, sometimes at a 
glance, and often under less than ideal conditions with poor light- 
ing and camouflage. Other kinds of bird experts have different 
skills: budgie experts (a budgerigar is a bred parakeet) can keenly 
identify unique individuals in cages, but need not have expertise 
with other birds, while professional chick sexers can quickly dis- 
criminate male from female genitalia on chicken hatchlings. In 
an entirely different domain, fingerprint experts typically match 
latent prints with a known sample, with both clearly visible, pre- 
sented side by side, and with time limits imposed by the analyst, 
not the environment. 



There are real-world consequences for studying certain 
domains of perceptual expertise, such as latent fingerprint examination. 
Despite the widespread use of forensic evidence - as well 
as its popular depiction on television - a recent report from the National Research 
Council of the National Academy of Sciences (2009) noted a 
"dearth of peer-reviewed, published studies establishing the scientific 
bases and validity of many forensic methods," especially 
those methods that require subjective visual pattern analysis and 
expert testimony. That scientific evidence is emerging, especially 
in the case of latent fingerprint expertise (e.g., Busey and Parada, 
2010; Busey and Dror, 2011). 

The choice of domain can also be influenced by various prac- 
tical considerations. It is easier to study perceptual expertise in 
a domain with millions of possible participants than an esoteric 
domain with a few isolated members. It is easier to study a domain 
where relevant stimuli are widely available in books and online. 
And it is easier to study a domain without barriers to contact, 
which can be the case for experts in the military, homeland security, 
and certain professions. For example, studies of expert baggage 
screeners require coordination with the Transportation Security 
Administration (TSA), and many details regarding stimuli and 
procedures cannot be shared with the public (e.g., Wolfe et al., 
2013). In the case of birding, there are millions of people in the 
US alone who consider birding a hobby, spending hours in their 
yards and parks, and billions of dollars on books, equipment, and travel (La 
Rouche, 2006). Photos of birds are widely available; books have 
been published on particularly difficult bird identifications (e.g., 
Kaufman, 1999, 2011). Birders regularly participate in citizen science 
efforts, such as the Christmas Bird Count, and provide data on 
bird sightings to databases like ebird.org. Anecdotally, this translates 
into a keen interest in science and a willingness to participate 
in research. 

RECRUITING 

In the past, experts usually had to be recruited locally, with 
advertisements posted around a university campus and in local 
newspapers. It may be hard for some to remember that it has only 
been in the past several years that not having an email address has 
become almost equivalent to not having a phone number, and that 
only recently has it become the case that most people have some 
Internet access. Being able to recruit participants more widely via 
the Internet promises not only to increase heterogeneity of par- 
ticipants, but also, and especially relevant for expertise research, 
promises to locate participants with a far greater range of exper- 
tise than might be possible when recruiting in a local geographic 
region. 

One rapidly growing means of recruiting and testing (see 
"Testing") participants is Amazon Mechanical Turk (AMT). AMT 
allows hundreds of subjects to be easily recruited and tested in a 
matter of days; participants on AMT are more demographically 
diverse than typical American college samples (Buhrmester et al., 
2011). This diversity is important for research examining individual 
differences in perception and cognition. While the potential 
population of AMT workers is large, it is unknown how many 
workers on the platform have high levels of domain expertise. 
For expertise research, recruitment via AMT may need to be 
supplemented by more direct recruitment of true domain experts 
(e.g., Van Gulick, 2014). 

Large domains of expertise have organizations, web sites, 
blogs, and even tweets and Facebook updates that target par- 
ticular individuals. In principle, online recruiting through these 
channels offers a quick, easy, and inexpensive means of finding 
experts. These could involve paid advertisements online and in 
electronic newsletters. More directly, these could involve mes- 
sages sent to email lists. The biggest challenge to this, however, 
is that many professional organizations or workplaces would 
rarely allow, and many outright prohibit, direct solicitation of 
members or employees, even for basic research; researchers can- 
not directly contact TSA baggage screeners or latent fingerprint 
examiners. By comparison, birding organizations, including local 
Ornithological and Audubon Societies, whose members join 
as part of a hobby, not a profession, can be less restrictive 
in terms of allowing contact with members, so long as con- 
tact is non-intrusive. In our case, we have identified several 
hundred birding groups in the US and Canada, we have con- 
tacted several dozen directly, and have received permission to 
solicit volunteer participants from most, having so far tested 
several hundred birders with a wide range of experience and 
expertise. 

EVALUATING LEVELS OF PERCEPTUAL EXPERTISE 

How do we know someone is a perceptual expert? A simple 
approach relies on subjective self-rating, often supplemented by 



self-report on the amount of formal training, years of expe- 
rience, or community reputation. For example, bird experts 
in Tanaka and Taylor (1991) were recommended by mem- 
bers of bird-watching organizations and had a minimum of 
10 years of experience, and those in Johnson and Mervis 
(1997) led birding field trips and some had careers related to 
birding. 

It is now well recognized that self-reports of expertise are insufficient 
and that objective measures of expert performance are 
needed (e.g., Ericsson, 2006, 2009); self-report measures of perceptual 
expertise are not always good predictors of performance 
(e.g., McGugin et al., 2012; Van Gulick, 2014). Therefore, recent 
work has used quantitative measures to assess expert abilities (e.g., 
see Gauthier et al., 2010). A detailed review and discussion of such 
measures is well beyond the scope of a brief perspective piece. 
A variety of quantitative measures of perceptual expertise have 
been used and new measures are currently being developed - these 
efforts to develop and validate new measures reflect a quickly growing 
interest in exploring individual differences in visual cognition 
(e.g., Wilmer et al., 2010; Gauthier et al., 2013; Van Gulick, 2014). 

While expert-novice differences are sometimes loosely 
described as if they were dichotomous, it is self-evident that 
expertise is a continuum, people vary in their level of exper- 
tise, and any measure of expertise must place individuals along 
a (perhaps multidimensional) continuum. Some behavioral or 
neural markers might distinguish pure novices from those with 
some experience but asymptote at only an intermediate level 
of expertise, while other behavioral or neural markers might 
distinguish the true experts from more middling experts and 
novices. Understanding the continuum of behavioral and brain 
changes, whether they are asymptotic, monotonic, or even 
non-monotonic over the continuum of expertise, can have 
important implications for understanding mechanistically and 
computationally how perceptual expertise develops (e.g., see 
Palmeri et al., 2004). 

Briefly, one useful measure has focused on the perceptual 
part of perceptual expertise: using a simple one-back matching 
task, images are presented one at a time and participants 
must say whether consecutive pictures are the same or different. 
Experts have higher discriminability (d') on images from 
their domain of expertise relative to non-expert domains, and 
this difference predicts behavioral and brain differences (e.g., 
Gauthier et al., 2000; Gauthier and Tarr, 2002). Another measure 
has focused on memory as an index of perceptual expertise: 
the Vanderbilt Expertise Test (VET; McGugin et al., 2012) mirrors 
aspects of the Cambridge Face Memory Test (Duchaine and 
Nakayama, 2006). Participants memorize exemplars from several 
different artifact and natural categories and then recognize 
other instances under a variety of conditions, and these differences 
in memory within particular domains predict behavioral 
and brain differences (e.g., McGugin et al., 2014). With our interest 
in categorization at different levels of abstraction, in work in 
preparation, we have developed a measure that has focused on 
categorical knowledge in perceptual expertise: adapting common 
psychometric approaches, we are refining what could essentially 
be characterized as a Scholastic Assessment Test (SAT, a standardized 
test widely used for college admission in the United 
States) of birding knowledge, with multiple-choice identifications 
of bird images ranging from easy (common backyard birds like 
the Blue Jay), to intermediate (distinctive yet far less common 
birds, like the Pileated Woodpecker or Great Kiskadee), to quite 
difficult identifications that challenge even fairly expert birders 
(like discriminating Bohemian from Cedar Waxwing, Hairy 
from Downy Woodpecker, or correctly identifying the many 
extremely similar warblers, sparrows, or flycatchers). Future work 
must consider to what extent different measures of perceptual 
expertise capture the same dimensions of expert knowledge and 
predict the same behavioral and brain measures that vary with 
expertise. 
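To make the discriminability index d' from the one-back matching task concrete, the following is a minimal sketch of the standard signal detection computation, d' = z(hit rate) - z(false alarm rate). It assumes that "same" trials are treated as signal trials, and it applies a log-linear correction (adding 0.5 to each cell) so that perfect performance does not yield an infinite value. Both choices, the function name `d_prime`, and the trial counts are illustrative assumptions rather than the scoring procedure of any cited study.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Estimate d' from trial counts, with a log-linear correction.

    Assumption: "same" trials are signal trials, so a hit is a correct
    "same" response and a false alarm is a "same" response on a
    "different" trial.
    """
    # Adding 0.5 per cell keeps both rates strictly between 0 and 1.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for one participant in two domains.
expert_domain = d_prime(hits=45, misses=5, false_alarms=6, correct_rejections=44)
novice_domain = d_prime(hits=35, misses=15, false_alarms=18, correct_rejections=32)
print(f"expert-domain d' = {expert_domain:.2f}, novice-domain d' = {novice_domain:.2f}")
```

The difference between the two values is the kind of within-participant index that the text describes as predicting behavioral and brain differences.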

TESTING 

Laboratory testing allows careful control and monitoring of 
performance, permits experiments that require precisely timed 
stimulus presentations, and of course allows sophisticated behavioral 
and brain measures like eye movements, fMRI, EEG, and 
the like. But laboratory testing incurs a potential cost in that 
the number of laboratory participants is often limited due to the 
expense of subject reimbursement, personnel hours, lab space, 
and equipment. And for any study of unique populations who 
might be geographically dispersed, such as perceptual experts, the 
cost of bringing participants to the laboratory can be 
prohibitive. 

Until fairly recently, the only real method for testing participants 
from a wide geographic area, apart from having experimenters 
or participants travel, was to have the experiments travel. 
For simple studies, this could mean mailed pencil-and-paper tests, 
while for more sophisticated studies, this could mean sending 
disks or CDs to participants to run on a home computer (e.g., 
Tanaka et al., 2010). As anyone who programs well knows, getting 
software to run properly on a wide range of computer hardware 
and operating system versions can be a daunting task. In the past 
few years, it has become popular, and wildly successful, to have 
experiments run via a web browser. While not entirely immune to 
the vagaries of hardware and operating system versions, browser- 
based applications are often more robust to significant variation, 
and can often automatically prompt users for upgrades to requisite 
software plug-ins. 

There are multiple platforms and approaches to online web- 
based experiments. One approach, highlighted earlier, uses AMT. 
In AMT, researchers publish Human Intelligence Tasks (HITs) that 
registered workers can complete in exchange for modest monetary 
compensation. AMT integrates low-level programming tools for 
stimulus creation, test design, and programming into one web- 
based application; other elements in AMT include automated 
compensation, recruitment, and data collection. Aside from the 
availability of these tools, a clear advantage of AMT is the poten- 
tial to recruit from a large and diverse pool of participants. An 
alternative approach is to develop and support a custom web- 
based server for experiments. There are powerful tools for creating 
web pages, such as WordPress (wordpress.org), and fairly sophisticated 
programs can be developed in Adobe Flash or JavaScript 
(e.g., De Leeuw, 2014; Simcox and Fiez, 2014). Perhaps an advan- 
tage of such custom portals is that people may be more attracted 
to them because of their interest in participating in research, not 



because of the potential to earn money, as might sometimes be 
the case for AMT. In the end, we suspect that most labs will 
use a combination of both platforms for recruiting, testing, or 
both. 

At least given current computer hardware in wide use, a 
potentially vexing problem for web-based experiments is timing. 
Fortunately, platforms such as Flash and JavaScript run 
on the local (participant) computer, so properly designed programs 
can avoid problems that could be introduced by variability 
in Internet connection speeds. Thankfully, reasonable response 
time measurements can be obtained (Reimers and Stewart, 2007; 
Crump et al., 2013; Simcox and Fiez, 2014). Indeed, as illustrated 
in Figure 1, we have successfully observed differences 
in RTs for expert and novice domains in online experiments 
using a Wordpress + Flash environment that mirror observations 
of expert speeded categorization from classic laboratory studies 
(Tanaka and Taylor, 1991). Unfortunately, the most critical lim- 
itation for now concerns stimulus timing. It is well known that 
LCD monitors in wide use have response characteristics far too 
sluggish to permit the kind of "single-refresh" presentations that 
would have been possible on previous CRTs. While presentation 
times of 100 ms or more are probably a safe bet, anything faster 
would require calibration to check that a participant had a suffi- 
ciently responsive monitor; it may be that the next generation of 
LCD, LED, or other technologies will (hopefully) eliminate these 
limitations. 
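One part of the stimulus timing problem can be illustrated with simple arithmetic: a display can only change an image at refresh boundaries, so a requested duration is rounded to a whole number of frames. The sketch below assumes a 60 Hz refresh rate, which is a common value but not one stated in the text, and the function name `frames_for_duration` is hypothetical. It covers only refresh-interval quantization and does not model the pixel response sluggishness of LCD panels discussed above.

```python
def frames_for_duration(target_ms, refresh_hz=60.0):
    """Return (n_frames, actual_ms) for a requested presentation duration.

    Assumption: the stimulus is shown for a whole number of refresh
    cycles, with a minimum of one frame.
    """
    if target_ms <= 0 or refresh_hz <= 0:
        raise ValueError("target_ms and refresh_hz must be positive")
    frame_ms = 1000.0 / refresh_hz  # duration of one refresh cycle
    n_frames = max(1, round(target_ms / frame_ms))
    return n_frames, n_frames * frame_ms


for target in (20, 50, 100):
    n, actual = frames_for_duration(target)
    print(f"{target} ms requested -> {n} frame(s), about {actual:.1f} ms shown")
```

Under this assumption, short requested durations carry proportionally larger rounding error than durations of 100 ms or more, which is one reason brief presentations would need per-participant calibration.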

SUMMARY 

Most human endeavors have a perceptual component. For exam- 
ple, keen visual perception is required in sports, medicine, science, 
games like chess, and a wide range of skilled behavior. Thus 
research on real-world perceptual expertise has potential theoretical 
and applied impacts in many domains. Here we briefly 
outlined at least some of the practical considerations that factor 
into research on real-world perceptual expertise. Several of these 
considerations are things that researchers often fret over behind 
the scenes but that rarely make it into a typical research publication, 
so in that sense we hope this brief perspective fills a small but 
important hole in the literature. 